Abstract

Background: A LEP transcript up-regulated in lungs of ducks (Anas platyrhynchos) infected by avian influenza A
virus was recently described in the Nature Genetics manuscript that reported the duck genome. In vertebrates,
the LEP gene symbol is reserved for leptin, the key regulator of energy balance in mammals.

Results: We searched extensively for this gene in the genome data that was submitted to the public
databases along with the duck genome manuscript, and extended the search to all avian genomes in the whole-genome
shotgun-sequencing database. This allowed us to report the first identification of coding sequences capable of encoding
the full leptin protein precursor in wild birds. Gene structure, synteny and sequence similarity (up to 54% identity and
68% similarity) to reptilian leptin, evident in falcons (Falco peregrinus and F. cherrug), tits (Pseudopodoces humilis),
finches (Taeniopygia guttata) and doves (Columba livia), confirmed that the bird leptin is a true ortholog of its
mammalian form. Nevertheless, in duck, as in other domestic fowl, the LEP gene was not identifiable.

Conclusion: Lack of the LEP gene in poultry suggests that birds that have lost it are particularly suited to domestication. 
Identification of an intact avian gene for leptin in wild birds might explain in part the evolutionary conservation of its 
receptor in leptin-less fowls. 
Background

The duck (Anas platyrhynchos) genome and transcriptome
were recently reported in Nature Genetics [1] as part of
an investigation of immune-related genes implicated in
the response to infection by avian influenza A virus.
Using deep sequencing, the authors compared the lung
transcriptomes of control and H5N1-infected ducks and
used the gene symbol LEP to describe a transcript that
was upregulated in the infected ducks. In vertebrates,
this gene symbol is reserved for leptin, the key regulator
of energy balance in mammals; however, the avian
ortholog has never been established.



Leptin in poultry research 

An entry in the Gene database [Gene ID: 373955] is set
aside for the chicken (Gallus gallus) leptin gene. The
lack of a nucleotide sequence for this entry reflects its
complex history: the gene was cloned and its sequence
was then retracted [2-5]. After removing the Bos taurus
sequences that contaminated the first submission of the
chicken genome project and the EST database [6], it was
finally established that no close ortholog of mammalian
leptin is present in this genome. However, the obvious
importance of identifying a master gene that controls
appetite and fattening in poultry prompted the cloning of
mammalian-like leptins in turkey (Meleagris gallopavo,
[GenBank: AAC32381], 95% identity to mouse leptin)
and duck ([GenBank: AAT38807], 99% identity to mouse
leptin). The turkey genome build housed in the Ensembl
database contains neither annotations for LEP nor
murine-like leptin sequences; hence, like chickens,
turkeys lack leptin.

Results and discussion 
Synteny confirms leptin in birds 

The typical structure of the leptin gene includes three exons:
a non-coding exon followed by a large intron 1, a second
exon harboring the translation-initiation codon close to
the splice acceptor site, and a large third exon that
encodes most of the protein (e.g. human [Gene ID: 3952];
fugu, Takifugu rubripes, Figure 1a). Based on a short
contig (1482 bp) of a whole-genome shotgun sequencing
(WGS) project, a partial gene capable of encoding the
third exon of a leptin-like protein has recently been
Figure 1 Synteny, shotgun assembly, and sequence alignments of avian leptin-like genes. (a) Comparison of genomic region of fugu
(Takifugu rubripes) LEP based on the NCBI Sequence Viewer display, and falcon contig assembly [GenBank: AKMT01018335-6, AKMU01055766-7].
Gene structures are drawn to scale shown by the bar below. Black and gray boxes represent translated and untranslated regions (UTRs) of exons,
respectively. When the exact transcription termination site is not characterized, large gray arrowheads at the 3' UTRs indicate the direction of
transcription, which is generally indicated by small arrowheads on the intron delineations. Gene identification and exon numbers are given above
and below the gene depictions, respectively. Exon numbers for falcon RBM28 follow their numbering in the orthologous rat gene. (b) Identification of
errors in the genome submission of falcon based on alignment with individual reads from Sequence Read Archive (SRA). Reads were located by
BLASTN search of the SRA database, downloaded with their quality information (FASTQ format), and assembled using GAP5 software [8]. The relevant
protein sequence is added above the contig editor output for a region of low coverage within the second LEP exon. The contig editor shows quality
values by gray scale and discrepancies between the sequences and the consensus are highlighted by a base symbol. The cutoff option was not
turned on and therefore low quality (dark gray) bases that were manually trimmed are not displayed. Individual reads and the mapping template
(AKMU01055767) are identified on the left. A base substitution and 4-base deletion (A****) are denoted on the mapping template, which is the first
read below the consensus line. (c) Amino acid sequences of leptin-like genes identified in the WGS database of birds: tit (Pseudopodoces humilis,
[GenBank: HG425120]), dove (Columba livia, [GenBank: HG425123]), falcons (Falco peregrinus, [GenBank: HG425121]; and F. cherrug, [GenBank:
HG425122]), and finch (Taeniopygia guttata, [GenBank: XP_004175839]) were compared with turtle LEP (Chelonia mydas, [GenBank: KB475412]). Box
coloration follows the legend of Figure 2.



annotated in Taeniopygia guttata [Gene ID: 101233729],
suggesting that leptin is expressed in the zebra finch. A
BLASTP search using the putative 115-aa polypeptide
encoded by this exon against the NR database indicated
that the reptilian green sea turtle (Chelonia mydas) leptin
was its closest ortholog, with 34% identity and 52%
similarity (Figure 1c), while mouse leptin was more distant
with 31% identity and 47% similarity. We hypothesized
that if leptin is indeed present in birds, it would have been
revealed in other avian WGS projects. Indeed, TBLASTN
against the WGS database with taxid restricted to Aves
revealed that the gene may be present in falcons (Falco
peregrinus and F. cherrug, [GenBank: AKMT01018335
and AKMU01055767], respectively), tits (Pseudopodoces
humilis, [GenBank: ANZD01014665, ANZD01014667])
and doves (Columba livia, [GenBank: AKCR01028475]). The
falcon sequences were 99.6% syntenic [7] and we therefore
assembled the contigs of both species together (15,541 bp,
also including [GenBank: AKMT01018336 and
AKMU01055766], Figure 1a) using GAP4 and GAP5 software [8],
incorporating additional reads in critical regions located with the
Sequence Read Archive Nucleotide BLAST (Figure 1b).
This revealed coding exons fitting the typical leptin gene
structure and capable of encoding a full-length leptin-like
166-aa precursor with 52% identity and 69% similarity of
F. cherrug to the turtle leptin (Figure 1c). Moreover, the
3'-neighboring gene of the falcon leptin showed 56% identity
and 68% similarity to rat RNA-binding motif protein
28 (RBM28, [Gene ID: 312182]). Local LEP-RBM28 synteny
is conserved and observed in fish (e.g. fugu, [Gene ID:
101064097]) and mammals (Figure 1a), thus strongly
indicating that these sequences are orthologous to
mammalian leptin.
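
The identity and similarity percentages quoted above are read from BLAST alignments. As a rough illustration of how such figures arise, the following Python sketch counts identical and similar columns in a pair of sequences that are already aligned. The residue groups, the function names and the toy sequences are assumptions made for this example only. BLAST itself counts as similar ("positives") any residue pair with a positive substitution-matrix score, so this sketch is not expected to reproduce the values reported here.

```python
# Coarse groups of physico-chemically similar residues. This grouping is an
# illustrative assumption; it is not the scoring scheme used by BLAST.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]


def _group_of(residue):
    """Return the similarity group containing the residue, or None."""
    for group in SIMILAR_GROUPS:
        if residue in group:
            return group
    return None


def identity_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both strings must have the same length and use '-' for gaps. Percentages
    are taken over all alignment columns, gap columns included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = 0
    similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # a gap column adds to the length but not to the counts
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif _group_of(a) is not None and _group_of(a) == _group_of(b):
            similar += 1
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length


# Toy alignment (hypothetical sequences, not taken from any leptin).
ident, sim = identity_similarity("MKV-LD", "MRVALE")
print(f"{ident:.1f}% identity, {sim:.1f}% similarity")
# prints "50.0% identity, 83.3% similarity"
```

Taking the percentages over all alignment columns means that a gapped alignment scores lower than an ungapped one with the same number of matches.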

The tit contigs were GC-rich (68%) with highly repetitive
GC elements and we were unable to combine them;
nevertheless, both coding exons corresponding to the
typical leptin structure were observable. These exons were
capable of encoding a full-length leptin-like 161-aa
precursor with 36% identity and 57% similarity to the turtle
leptin (Figure 1c). Further search of the WGS database
revealed similarity to single exons: the previously annotated
exon 3 for zebra finch and a novel match to exon 2 for
dove. We extended the detected dove contig with reads
[SRA: SRR511892.31385855, SRR511913.3134902] and
found the initiation codon of a typical structure of leptin
exon 2 capable of encoding 48 aa of the 5' end of a
leptin-like precursor with 56% identity and 72% similarity to
turtle leptin (Figure 1c, [GenBank: HG425123]). Evidence
for expression of this exon was provided by a single read
derived from racing-liver RNA-Seq library prepared from
poly-A+ RNA [SRA: SRR521362.22831237.2]. This read
represented a spliced transcript of 2 non-coding exons
that preceded the first coding exon, in agreement with the
sequence of the introns and canonical splice sites evident
in the genomic submission [GenBank: AKCR01028476].
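
GC content such as the 68% noted for the tit contigs can be computed directly from the sequence. The following Python sketch shows one way to do so, together with a sliding-window scan that can help locate GC-rich stretches. The function names, the window and step sizes, and the toy sequences are assumptions for illustration and do not describe the procedure used in this work.

```python
def gc_fraction(seq):
    """Return the fraction of G and C among the unambiguous bases of seq.

    Ambiguous symbols such as N are ignored in both numerator and denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


def gc_windows(seq, window=100, step=50):
    """Return a list of (start, GC fraction) for sliding windows along seq.

    Windows without any unambiguous base are skipped. The default window and
    step sizes are arbitrary choices for this example.
    """
    results = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        try:
            results.append((start, gc_fraction(chunk)))
        except ValueError:
            continue
    return results


# Toy input (hypothetical sequence, not a tit contig).
print(gc_windows("GGGGAAAA", window=4, step=4))
# prints [(0, 1.0), (4, 0.0)]
```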

Leptin remains unidentifiable in domestic fowls 

Examination of the recently submitted duck genome annotations
revealed no gene with LEP as its symbol and no
gene annotated as leptin. Moreover, BLASTN search of the
WGS database using "duck leptin" [GenBank: AAT38807]
or any of the novel leptin-like bird proteins described here
indicated no significant similarity to leptin in this genome
submission. Thus, we conclude that this gene may also be
missing in duck. The editorial process of a high-ranking
journal is expected to ensure that genome publications
seeking a fast impact do not turn into lists of unverified
gene symbols that no one actually reads. It is further
recommended that authors who deposited erroneous
sequences of murine-like leptins for birds
in sequence databases [GenBank: AAC32380, AAC60368,
AAL35557, AAT38807, 042164, 093416] caution users of
the possibility of sequence contamination. It should also be
noted that 11 GenBank mRNA submissions of fish leptins
with >98% identity to the mouse transcript should be
similarly annotated [GenBank: DQ784814-6, AY497007,
AY547279, AY547322, AY551335-9].

Moreover, a large volume of misinformation may have 
been generated as these murine-like leptins were the basis 
for studies without prior knowledge of leptin's activity in 
the targeted species, including reports of the expression of 
the erroneous leptin gene product at the mRNA and 
protein levels (e.g. [9-16]). These leptins were reported 
to attenuate appetite, or affect other parameters related 
to the control of energy balance when administered to 
chickens [17-20], chick embryos [21-23], ovarian [24] 
and hepatoma [25] cells in culture or skeletal bones in 
an ex vivo model system [26]. 

Recent findings 

While this work was under consideration, 3 reports
describing avian leptins were submitted and published
[27-29], including an indication, based on a single
RNA-seq read, of a leptin-like transcript in the duck [27].
We used the sequence information from this read and the
read from the opposite end of this fragment to design a pair
of PCR primers that bridged the sequence gap between
these reads. The PCR protocol applied was adapted for
amplification of GC-rich leptin sequences [28]. DNA
sequencing of the resulting PCR product confirmed the
existence in the duck genome of a leptin-like sequence
orthologous to the sequence of the last exon of other avian
leptins (Figure 2). However, analysis of the genomic
raw deep-sequencing data [BioProject: PRJNA46621] was
hampered by the existence of similar repetitive sequence
structures, and we were able to extend this sequence only
towards the 5' end. We detected no reads that could extend the
3' end with sequence coding for a valid cysteine knot motif,
which is typical of all leptins [30]. Furthermore, analyzing the raw
RNA-seq data [BioProjects: PRJNA194464, PRJNA188394]
revealed transcription matching the repetitive sequence
structures, but no additional reads for the duck leptin-like
sequence described here could be identified (data not
shown). Detection of leptin syntenic genes like miR129-1
favors the possibility that the leptin gene may also exist in
ducks [29]. Hence, the existence of a fully functional leptin
gene in the duck remains an open question.
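
The primer pair listed in the legend of Figure 2 can be checked against a candidate template by a simple in-silico PCR: find the forward primer, find the reverse complement of the reverse primer downstream of it, and report the sequence between them. The Python sketch below assumes exact primer matches and uses a placeholder template that is not the duck sequence. The spacer length was chosen only so that the toy product matches the 81 bp size given in the legend, and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Return the predicted PCR product for exact primer matches, or None.

    The forward primer is searched on the given strand; the reverse primer
    must anneal downstream of it on the opposite strand.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]


# Primer sequences as given in the legend of Figure 2.
FORWARD = "CAGGTTTCCAGCGCGTC"
REVERSE = "GAGGTTCTCCAGGTCGCTTA"

# Placeholder template: the 44-nt spacer and the flanks are arbitrary filler
# and carry no information about the real duck sequence.
toy_template = "TTT" + FORWARD + "ACGT" * 11 + reverse_complement(REVERSE) + "AAA"

product = amplicon(toy_template, FORWARD, REVERSE)
print(len(product))
# prints 81
```

Exact matching keeps the sketch short; a practical check would also tolerate mismatches and search both strands of the template.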

Further BLASTN and TBLASTN searches of the WGS
database using the novel avian leptin sequences revealed
indications for the existence of leptin in additional bird
species. These include woodpecker, eagle and quails
(Figure 2). Protein motifs typical of leptin were identified
and annotated, including the leader peptide, 4-helix bundle
structure and cysteine knot (Figure 2). While the leptin
gene of woodpecker was apparent on an unplaced
genomic scaffold [GenBank: JJRU01076739], the gene of
golden eagle was much more obscure. The eagle's first coding
exon (exon 2) was intact in a WGS contig [GenBank:
Figure 2 The amino-acid sequence of mouse leptin (Mus musculus, [GenBank: NP_032519]) was aligned with leptin and LEP-like sequences of
birds identified in the WGS and SRA databases: zebra finch (Taeniopygia guttata, [GenBank: AFK25168]); Tibetan ground tit (Pseudopodoces
humilis, [GenBank: HG425120]); budgerigar (Melopsittacus undulatus, [GenBank: AHZ86931]); falcon (Falco cherrug, [GenBank: HG425122]);
golden eagle (Aquila chrysaetos canadensis, derived from [GenBank: JDSB01143511, SRR1016445.84242652, SRR1016445.37770192,
JDSB01163119, SRR1017148.40189562]); dove (Columba livia, [GenBank: HG797022]); downy woodpecker (Picoides pubescens, derived from
[GenBank: JJRU01076739]); northern bobwhite quail (Colinus virginianus, derived from [GenBank: AWGU01372785]); Japanese quail (Coturnix
japonica, derived from [GenBank: ERR125582.247893.2, DRR002300.424669919.1, DRR002301.19253882.1, DRR002301.124106485.1,
DRR002301.44847625.1]); and the mallard duck (Anas platyrhynchos, derived from [GenBank: SRR040307.6134664, SRR797835.67134665.2,
SRR040316.4927613, SRR797835.67134665.1]). Dashes indicate gaps introduced by the alignment program. Identical and similar amino-acid residues
in at least three or six sequences are indicated by a black and gray background, respectively. White boxes indicate non-conservative amino-acid changes
between the proteins. The signal peptide and structural elements, helices and loops [28] are denoted above the alignment. The two conserved cysteines
forming a lasso knot [30] are indicated by black arrowheads. The duck genomic sequence was confirmed using previously described procedures [28]; DNA
was extracted from frozen mallard duck purchased from a local husbandry (Levin, Kfar Baruch, Israel) and the nucleotide sequence was determined by capillary
sequencing of the 81 bp product amplified using PCR primers (F, 5'-CAGGTTTCCAGCGCGTC-3'; R, 5'-GAGGTTCTCCAGGTCGCTTA-3').



JDSB01143511]. However, de novo assembly of genomic
raw deep-sequencing data [BioProject: PRJNA222866]
was unable to extend the sequence of the last-exon-like
structure [GenBank: JDSB01163119]. Yet, all the putative
motifs encoded by the highly orthologous falcon leptin
gene (89% identity and 91% similarity) were assembled
to form disordered palindromic and repetitive contigs
that also contained the leptin syntenic gene RBM28. Such
structures were also typical for the duck (data not shown).
Bobwhite quail was the first galliform with a partial exon
3-like sequence observable in a contig assembly of the WGS
effort of this quail (Figure 2, [GenBank: AWGU01372785]).
We used this sequence as a template for a BLAST search
of the deep-sequencing data deposited for the Japanese
quail in the SRA database and of the related WGS assembly.
The leptin gene was not identifiable in the latter; however,
we were able to download and assemble the matching
SRA sequence reads (Figure 2), which correspond to an
intact exon 3 structure. We repeated the sequence
searches against the chicken genome and confirmed that
even this galliform LEP-like sequence is not detectable in
Gallus gallus, in agreement with the observation that
administration of a leptin antagonist had no effect on
appetite and body growth of layer chickens [31]. We could
not associate any ESTs or RNA-seq reads with the quails'
leptin-like genes; moreover, the role of the leptin
signaling pathway may differ in galliformes [28]. This
hypothesis may also be related to the finding that the
hunger hormone ghrelin [32], which is predominantly
synthesized in the gastrointestinal tract in chickens and
mammals, has been reported to have an opposite effect on
appetite in chickens compared to mammals [33,34].
Hence, galliformes provide a unique model system to
decipher an alternative control mechanism of energy
homeostasis, and we intend to further study this in the
Japanese quail.

Conclusions 

The absence of a leptin gene in genomes related to
domestic fowls seems incompatible with the presence of
the leptin receptor gene, which has been cloned in
chicken [35], turkey [36] and duck [1]. Herein we report
the first identification of coding sequences capable of
encoding the full leptin protein precursor in birds.
Identification of an intact avian gene for leptin might explain
in part the evolutionary conservation of its receptor in
Aves. The loss of leptin in the lineage of domestic fowls
suggests that relaxing the control of appetite made these
birds particularly suited to domestication.

Methods 

Comparative sequence analysis 

For the characterization of leptin genes not yet annotated
in the avian genome assemblies, sequence homology
searches were carried out in several publicly available
databases (NCBI: NR, WGS, SRA; and Ensembl) using the
BLAST family of programs. Relevant sequence entries
were downloaded with their quality information (FASTQ
format), and reassembled using the GAP5 software [8].
The amino acid sequences were aligned using CLUSTALW
(http://www.genome.jp/tools/clustalw/) with the
default parameters and the GONNET matrix, and colored
using the BOXSHADE program
(http://www.ch.embnet.org/software/BOXform.html).
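
Reads downloaded in FASTQ format carry a per-base quality string alongside the sequence. The Python sketch below illustrates how such records can be parsed and how low-quality 3' ends might be trimmed automatically. It assumes simple four-line records and Phred+33 quality encoding, and the quality threshold of 20 is an arbitrary choice for this example. It is not the procedure used here, in which low-quality bases were trimmed manually in the GAP5 contig editor.

```python
def read_fastq(lines):
    """Yield (name, sequence, qualities) from four-line FASTQ records.

    Assumes Phred+33 encoding and no line wrapping within a record.
    """
    records = [line.rstrip("\n") for line in lines if line.strip()]
    if len(records) % 4 != 0:
        raise ValueError("incomplete FASTQ record")
    for i in range(0, len(records), 4):
        header, seq, plus, qual = records[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at entry {i}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence and quality lengths differ in {header}")
        yield header[1:], seq, [ord(c) - 33 for c in qual]


def trim_3prime(seq, quals, min_q=20):
    """Drop trailing bases until the last base has quality >= min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


# Toy record (hypothetical read, not taken from the SRA).
example = ["@read1", "ACGTAC", "+", "IIII#!"]
for name, seq, quals in read_fastq(example):
    trimmed, _ = trim_3prime(seq, quals)
    print(name, trimmed)
# prints "read1 ACGT"
```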

Sequence data accessions 

The annotated sequences are available in GenBank under 
accessions HG425120-3. 